ABSTRACT

Interactive three dimensional graphics applications, such as 
terrain data representation and manipulation, require 
extensive arithmetic processing. Massively parallel machines 
are attractive for this application since they offer high 
computational rates, and grid connected architectures provide 
a natural mapping for grid based terrain models. This paper 
presents algorithms for data movement on the MPP in support 
of pan and zoom functions over large data grids. It is an 
extension of earlier work that demonstrated real-time 
performance of graphics functions on grids that were equal in 
size to the physical dimensions of the MPP. When the 
dimensions of a data grid exceed the processing array size, data 
is packed in the array memory. Windows of the total data grid 
are interactively selected for processing. Movement of packed 
data is needed to distribute items across the array for efficient 
parallel processing. Execution time for data movement was 
found to exceed that for arithmetic aspects of graphics 
functions. Performance figures are given for routines written 
in MPP Pascal. 

Keywords: interactive graphics, parallel algorithms, MPP, 
terrain models 

INTRODUCTION 

Multiprocessor architectures have been used for several years 
to meet the demanding computational requirements of 
interactive 3D graphics. The computing resources may take
the form of specialized hardware that exploits the vector and
pipeline suitability of graphics problems (Refs. 4, 5, 7, 9,
and 13). However, several architectures that were not designed
specifically with graphics applications in mind are versatile
enough to exploit the vectorizable nature of the computations
(Refs. 1, 2, 3, 6, and 11). This paper focuses on the use of
one such machine, the
MPP, for a specific graphics problem: representation of 
digital terrain data and interactive manipulation of 
corresponding terrain images. 

Prior work has shown the feasibility of interactive 
manipulation of stereo pair images of small terrain models on 
the MPP (Ref. 10). In the prior work grids of terrain data 
were 128 by 128 points, exactly matching the dimensions of 
the MPP and leading to a natural mapping of terrain data to the 
processing grid. To broaden the range of applications, it is
necessary to implement interactive graphics operations on
much larger databases of terrain points. Data structures
and algorithms reported in this paper are for pan and zoom
functions on larger databases. The work is more completely
described in Ref. 8.

* This work was partially supported by NASA Goddard Space
Flight Center through the MPP Working Group.

PARALLEL ALGORITHMS 
Data Representation 

Grid-based digital terrain models contain an m by n
rectangular grid of points (x_i, y_j), 1 <= i <= m, 1 <= j <= n,
which correspond to longitude and latitude values on the earth's
surface. Each grid point has an associated value z_{i,j}, which is
the elevation above sea level at the point (x_i, y_j).

Typically, a 128 by 128 grid of elevation points is considered 
to match the MPP’s architecture. Elevation points are assigned 
to processing elements (PEs) in a straightforward way, with
PE_{i,j} containing the grid point (x_i, y_j, z_{i,j}). A grid that just
matches the PE array size constitutes a small terrain model 
and a limited display. In order to examine a different part of 
the terrain, it is necessary to input a new set of coordinates 
with elevation points from the host or staging memory. We 
wish to make a large database available within the array unit 
at the outset, and be able to display arbitrary parts of the 
terrain in real-time, under interactive selection control. 

The methods described in this paper can handle a terrain 
database up to size 512 by 512 in the limited 1K per PE
memory of the MPP. However, for purposes of illustration, we 
consider a model with a 4 by 4 array of PEs and an 8 by 8 
array of terrain data. That is, there are four data points per 
PE. 

In order to exploit the full parallel capabilities of the MPP it 
is necessary that terrain data points to be processed be spread 
across the available PEs. This will require some movement of 
data within the processing array. A particular storage 
mapping, shown in figures 1 and 2, is chosen because it 
supports the movements used in pan and zoom functions. The 
original 8 by 8 data array, I in figure 1, is reformatted into 
four 4 by 4 subarrays, A, B, C, and D in figure 2. Data 
elements are mapped as follows: 

A = {a_{i,j}} where a_{i,j} = I_{2i, 2j}

B = {b_{i,j}} where b_{i,j} = I_{2i, 2j+1}

C = {c_{i,j}} where c_{i,j} = I_{2i+1, 2j}

D = {d_{i,j}} where d_{i,j} = I_{2i+1, 2j+1}

where 0 <= i <= 3 and 0 <= j <= 3.
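
The mapping above can be illustrated with a minimal Python sketch. It is not the authors' MPP Pascal code: plain nested lists stand in for the PE array memory, the function name `reformat` is hypothetical, and each element of I is labelled with its own (row, column) position so that the packing can be traced.

```python
# Illustrative sketch only: nested lists stand in for the PE array memory.
N = 8   # side of the original data array I
P = 4   # side of the PE array

# Label each element of I with its (row, column) so the packing can be traced.
I = [[(r, c) for c in range(N)] for r in range(N)]

def reformat(grid, p):
    """Split a 2p-by-2p grid into four p-by-p subarrays A, B, C, D."""
    A = [[grid[2 * i][2 * j] for j in range(p)] for i in range(p)]
    B = [[grid[2 * i][2 * j + 1] for j in range(p)] for i in range(p)]
    C = [[grid[2 * i + 1][2 * j] for j in range(p)] for i in range(p)]
    D = [[grid[2 * i + 1][2 * j + 1] for j in range(p)] for i in range(p)]
    return A, B, C, D

A, B, C, D = reformat(I, P)

# PE (1, 2) holds the 2-by-2 block of I whose upper-left point is (2, 4).
print(A[1][2], B[1][2], C[1][2], D[1][2])
```

Under this packing, every PE holds one 2 by 2 block of neighbouring grid points, so any 4 by 4 window of I occupies only part of the PE array.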
Figure 3. Packed data with a window selected.


After reformatting, PE_{i,j} will contain data points from the
same position in each of the four data arrays. That is, the
terrain data points shown in figure 2 as a_{i,j}, b_{i,j}, c_{i,j}, and
d_{i,j}, collectively called t_{i,j}, are mapped onto PE_{i,j}.

Figure 2. Reformatted data, as stored in a 4 by 4 PE array.


Windows 


A subset of terrain points that exactly conforms to the 
dimensions of the MPP is called a window. Figure 3, where a 
square corresponds to a single PE, shows that a window only 
includes data points that are localized to part of the PE array. 
To exploit the parallelism of the machine, it is necessary to
spread the subset of points over the entire PE array such that 
each PE has one data point from the window. Figure 4 shows 
the distributed data produced by the "spread" function 
described below. 
Figure 4. Distribution of the selected window over the array.


The "Spread" Function

This function consists of a series of bit plane data movement 
operations. All statements in this routine are executed 
simultaneously on all PEs. Two data movement masks NSMASK 
and EWMASK are created. MPP Pascal (Ref. 12) primitives 
such as "rotate" and "any" are used in conjunction with these 
data planes to route the selected data to the target PE. The 
"spread" routine is invoked four times so that elements of A, 
B, C, and D enclosed by the window can be moved to their target 
positions, one array at a time. Parameter SOURCE is the 
particular source array; either A, B, C, or D. Parameter 
INDEX is the position of the upper left corner of the window 
when it is packed in the array. 


routine SPREAD

    MAKEMASK (A, B, C, D, SOURCE, INDEX, NSMASK, EWMASK);

    where (NSMASK ≠ 0)
        rotate EWMASK and SOURCE to positions
        indicated by NSMASK;
    end where;

    where (EWMASK ≠ 0)
        rotate SOURCE to position indicated by
        EWMASK;
    end where;

end routine.

Routine SPREAD calls the following routine, MAKEMASK.

routine MAKEMASK (A, B, C, D, SOURCE, INDEX, NSMASK, EWMASK);

    for PEs outside the selected window
        NSMASK = 0;
        EWMASK = 0;
        SOURCE = 0;
    end for;

    for PEs inside the selected window
        depending on INDEX
            SOURCE = A or B or C or D;
        COMPUTEMASK (NSMASK, EWMASK);
    end for;

end routine.

Routine MAKEMASK calls routine COMPUTEMASK. 

routine COMPUTEMASK (NSMASK, EWMASK);

    if t_{i,j} in PE_{i,j} needs to be moved to PE_{x,y}
        NSMASK = x - i;
        EWMASK = y - j;
    end if;

end routine.
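
The displacement computation in COMPUTEMASK, and the resulting distribution of a window, can be sketched sequentially in Python. This is a hedged illustration rather than the MPP implementation: the loops visit PEs one at a time instead of operating on bit planes, the direct assignment stands in for the conditional "rotate" steps, and the names `packed`, `spread`, and `final` are assumptions made for the example. The small 4 by 4 PE array and 8 by 8 data grid are used.

```python
P = 4        # PE array is P by P
N = 2 * P    # packed data grid is N by N

# Label each grid point with its (row, column) so movements can be checked.
I = [[(r, c) for c in range(N)] for r in range(N)]

# packed[k][i][j] is subarray k (0=A, 1=B, 2=C, 3=D) as held by PE (i, j).
packed = [[[I[2 * i + k // 2][2 * j + k % 2] for j in range(P)]
           for i in range(P)] for k in range(4)]

def spread(packed, origin):
    """Distribute the P-by-P window whose upper-left grid point is origin."""
    r0, c0 = origin
    final = [[None] * P for _ in range(P)]
    for k in range(4):                  # one pass per source array
        for i in range(P):
            for j in range(P):
                # Grid position of the point of subarray k held by PE (i, j).
                r, c = 2 * i + k // 2, 2 * j + k % 2
                inside = r0 <= r < r0 + P and c0 <= c < c0 + P
                if not inside:
                    continue            # PEs outside the window move nothing
                x, y = r - r0, c - c0   # target PE for this point
                nsmask = x - i          # north-south displacement
                ewmask = y - j          # east-west displacement
                final[i + nsmask][j + ewmask] = packed[k][i][j]
    return final

window = spread(packed, (3, 2))
print(window[0][0], window[3][3])
```

After the four passes each PE holds exactly one point of the window, which is the precondition for the per-point parallel processing described in the text.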


Once data has been spread over the entire array, further 
functions such as intensity calculations, hidden surface 
removal, and rendering, can be executed in parallel with each 
PE handling one data point as in Ref. 10. 


Pan and Zoom Functions 

We are interested in the ability to move the window about in 
the array of terrain point sets. This process of moving a 
window in object space is called "panning". The routine that
follows provides the pan function as simply a selection of the 
window to be spread. This routine is used prior to computation 
of intensities and image rendering. 


routine PAN (ORIGIN);

    choose window based on user-defined ORIGIN;
    SPREAD;

end routine.

Another means of examining the entire database is to sacrifice 
resolution for extent of coverage. By choosing representative 
data points from the database it is possible to zoom-in or 
zoom-out using the routine below. The spacing between 
adjacent chosen points determines the resolution or extent of 
zoom. For our small example there are only two zoom settings. 
Maximum resolution is achieved by choosing a window and 
carrying out a SPREAD. For minimum resolution, it is 
possible to simply select one of the four arrays A, B, C, or D. 
With greater terrain data packing factors, intermediate levels 
of resolution, involving different extents of data movement,
are possible. 

We note again that once data has been spread over the entire
array, further functions such as intensity calculations, hidden 
surface removal, and rendering, can be executed in parallel. 


routine ZOOM (ORIGIN, RESOLUTION, INDEX);

    if RESOLUTION = MAX
        choose window based on user-defined ORIGIN;
        SPREAD;
    end if;

    if RESOLUTION = MIN
        depending on INDEX
            FINAL = A, or B, or C, or D;
    end if;

end routine.
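
The two zoom settings of the small example can be sketched in Python as follows. This models only which data points end up on the PE array, not the data movement itself; the function name `zoom`, the string values "MAX" and "MIN", and the numeric encoding of INDEX (0=A, 1=B, 2=C, 3=D) are assumptions made for the illustration.

```python
P = 4        # PE array is P by P
N = 2 * P    # data grid is N by N

# Encode each grid point as 10 * row + column so results are easy to read.
I = [[10 * r + c for c in range(N)] for r in range(N)]

def zoom(grid, origin, resolution, index=0):
    """Return the P-by-P set of points that the PE array would process."""
    if resolution == "MAX":
        # Full resolution: a P-by-P window of adjacent points at origin.
        r0, c0 = origin
        return [row[c0:c0 + P] for row in grid[r0:r0 + P]]
    # Minimum resolution: every second point, i.e. one of A, B, C, D,
    # covering the full extent of the database.
    dr, dc = index // 2, index % 2
    return [[grid[2 * i + dr][2 * j + dc] for j in range(P)] for i in range(P)]

full_res = zoom(I, (2, 3), "MAX")
overview = zoom(I, (0, 0), "MIN", index=0)
print(full_res[0], overview[0])
```

Panning corresponds to calling the full-resolution branch with a different origin.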


TIMING ANALYSIS 

Graphics programs were written in MPP Pascal (Ref. 11) for 
a database of 256 by 256 terrain points. Execution time can be 
determined using system provided timing routines. The 
structure of a typical graphics program loop with pan and 
zoom operations is to distribute the data points from the 
selected window over the entire array, then compute 
intensities of all pixels in parallel, then render the image. 
Table 1 gives actual timing measurements.


    Operation               Time (milliseconds)

    distribute data         2283
    compute intensities        5
    render image             819


Table 1. Measured timing for a 256 by 256 database.


The time taken to distribute data is almost entirely accounted 
for by 28 calls to the SPREAD routine. It is invoked seven 
times for each of the four arrays A, B, C, and D discussed 
earlier. Moreover, the time taken to execute SPREAD once is 
almost entirely accounted for by an inner loop which uses the 
MPP Pascal "rotate" function extensively. The measured time
for one "rotate" is 204 microseconds. The time taken just for
executing "rotate" functions while distributing data is 2172
milliseconds. SPREAD is also used in rendering an image and 
contributes greatly to its execution time. 

An equivalent to the "rotate" function can be achieved in lower
level languages of the MPP in 3.3 microseconds, rather than 
the 204. Table 2 is derived from the measured timing by 
substituting the much lower rotate time. 


    Operation               Time (milliseconds)

    distribute data          142
    compute intensities        5
    render image              53


Table 2. Expected timing with efficient "rotate".


A necessary condition for real-time graphics image generation 
is that one pass through the loop of the program must take no
more than 33 milliseconds. Even with the expected timings of
Table 2, each pass takes 200 milliseconds, yielding only five
frames per second.
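
The frame-rate arithmetic can be checked with a short Python sketch that uses only the figures from Tables 1 and 2 and the 33 millisecond budget stated above. The dictionary and function names are illustrative.

```python
# Times in milliseconds, copied from Table 1 (measured) and Table 2 (expected).
table1 = {"distribute data": 2283, "compute intensities": 5, "render image": 819}
table2 = {"distribute data": 142, "compute intensities": 5, "render image": 53}

REAL_TIME_BUDGET_MS = 33   # one pass through the loop must fit in this time

def frame_stats(times_ms):
    """Return (time per pass in ms, frames per second) for one loop pass."""
    total = sum(times_ms.values())
    return total, 1000 / total

for name, table in (("Table 1", table1), ("Table 2", table2)):
    total, fps = frame_stats(table)
    verdict = "meets" if total <= REAL_TIME_BUDGET_MS else "exceeds"
    print(f"{name}: {total} ms per pass, {fps:.2f} frames/s, {verdict} budget")
```

In both tables data distribution is the largest single term, which is consistent with the observation that data movement dominates the arithmetic.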

An alternative approach is to bypass data distribution in favor 
of iteratively using a smaller portion of the processing array. 
Image generation time using this approach, for the same
database as above, was measured at 73 milliseconds. Execution 
time is reduced but full parallelism of the array is not used. As 
the number of data points in the database is increased by a
factor of K, the number of active PEs in the array unit is
decreased by a factor of K. This will result in a factor of K
increase in time for the intensity computation alone.

A third approach is to maintain the database in the staging 
memory and bring in only the data points needed for each 
computation. Image generation time reduces to 59 
milliseconds. However, data in the staging memory is only 
accessible along certain predefined boundaries. This 
complicates pan and zoom functions. 

CONCLUSION 

Prior work has shown massive parallelism to be suitable for 
graphics applications on data arrays which fit the processing 
array size. When larger arrays must be handled, the time 
involved in moving data becomes the dominant part of the 
problem and can take the performance out of the real-time 
realm. 